Systems genomics of salinity stress response in rice

Populations can adapt to stressful environments through changes in gene expression. However, the role of gene regulation in mediating stress response and adaptation remains largely unexplored. Here, we use an integrative field dataset obtained from 780 plants of Oryza sativa ssp. indica (rice) grown in a field experiment under normal or moderate salt stress conditions to examine selection and evolution of gene expression variation under salinity stress conditions. We find that salinity stress induces increased selective pressure on gene expression. Further, we show that trans-eQTLs rather than cis-eQTLs are primarily associated with rice’s gene expression under salinity stress, potentially via a few master-regulators. Importantly, and contrary to the expectations, we find that cis-trans reinforcement is more common than cis-trans compensation which may be reflective of rice diversification subsequent to domestication. We further identify genetic fixation as the likely mechanism underlying this compensation/reinforcement. Additionally, we show that cis- and trans-eQTLs are under different selection regimes, giving us insights into the evolutionary dynamics of gene expression variation. By examining genomic, transcriptomic, and phenotypic variation across a rice population, we gain insights into the molecular and genetic landscape underlying adaptive salinity stress responses, which is relevant for other crops and other stresses.


INTRODUCTION
Plants face numerous stresses that reduce their growth and fitness, and they have a variety of adaptations to help them deal with environmental challenges (Mareri et al., 2022).For crop plants, stresses such as drought, heat and salinity can greatly limit productivity and agricultural sustainability worldwide (Fahad et al., 2017;Kopecká et al., 2023); indeed, as a direct result of various abiotic stresses, an estimated ~ 50-70% of crop yields are lost (Francini and Sebastiani, 2019).While there is a wealth of information on the physiological responses of plants, including crops, to stress (H.Zhang et al., 2022), we are still developing an understanding of the underlying genetic responses to stress and the complex interactions between stresses and environmental cues, networks of gene expression regulation, physiological responses and ultimately plant fitness (Cramer et al., 2011;Siddiqui et al., 2021).
Much of the research on the genetic basis of stress responses in crops has focused on identifying individual genes associated with specific responses and/or tolerance traits (Cui et al., 2020;Jiang et al., 2019;Kumar and Wigge, 2010;Laohavisit et al., 2013;Yuan et al., 2014).
Stress responses, however, are complex traits that can be influenced by multiple genetic pathways with numerous interconnected genes.Expression levels of genes are controlled by regulatory elements and are often environmentally induced, suggesting a critical role of regulatory divergence in adaptive evolution (De Clercq et al., 2021;Wilkins et al., 2016).Furthermore, changes in gene expression can lead to changes in the transcript network correlational structure, reshaping regulatory pathways (Lea et al., 2019) and influencing the way in which gene expression variation can influence adaptation to stressful environments (Signor and Nuzhdin, 2018;Wagner and Lynch, 2008).As such, there is an increasing need to understand how evolutionary dynamics of gene expression and transcript abundance relate to the genetic underpinnings of stress responses and adaptation in crops (Groen et al., 2020;Ruffley et al., 2023;Siddiqui et al., 2021).
One of the most important stresses for many plants is salinity.Soil salinity causes osmotic imbalance between the plant and the soil, which impedes the uptake of water and other key nutrients (Hakim et al., 2014;Munns, 2002), possibly leading to acute ion toxicity (Liang et al., 2018).While some plants are considered halophytic and can thrive in saline environments, other plants are highly sensitive to salts and experience negative effects of salinity even at low concentrations (Carillo et al., 2011).Oryza sativa (Asian rice) is one such salt-sensitive crop, Selection on gene expression.To identify transcripts associated with high fitness in normal and saline environments, we measured the strength of selection on gene transcript levels.We did this using phenotypic selection analysis (Lande and Arnold, 1983), taking the total number of filled rice seeds/grains (total fecundity) as a proxy for fitness (Groen et al., 2020).We estimated the linear (S) and quadratic (C) selection differentials, which are estimates for directional (negative or positive S) selection, and stabilizing (negative C) or disruptive (positive C) selection.We calculated the raw (S and C), variance-standardized (S  and C  ), and mean-standardized differentials (S  and C  ).We found that both mean and variance of transcript expression vary significantly between conditions (Mann-Whitney U-test, mean and standard deviation: P < 2.2 x 10 -16 ), which means that the interpretation of the strength of selection based on either of these differentials could be misleading (Supplementary Table 2).To overcome this, and given that we were interested in selection on gene expression, which were all measured in the same units, we used the non-standardized raw selection differentials for all downstream analyses.We also report the mean-and variance-standardized selection coefficients (Supplementary Table 3).
Most transcripts were (nearly) neutral (|S| < 0.1) in both the normal and saline environments (Fig. 1c).Additionally, within the normal environment, there was a general trend towards stronger positive compared to negative selection on gene expression, indicating that higher expression of transcribed genes is associated with greater fitness (Table 1, Mann-Whitney U-test, normal: P = 3.95 x 10 -15 ).In comparison, though the strength of positive selection was higher than that of negative selection in the saline field, higher expression of most of the transcribed genes is associated with lower fitness (Table 1, Mann-Whitney U-test, saline: P = 0.0068).Although no transcripts cleared the Bonferroni correction threshold in saline conditions, under normal conditions seventeen transcripts cleared the threshold, thirteen of which were under positive directional selection (Supplementary Table 3).
We also found that a high proportion of transcripts experienced stabilizing selection (C < 0) in both normal and saline environments.This effect was pronounced under normal conditions, with the strength of stabilizing selection being stronger than that of disruptive selection (Table 1; Mann-Whitney U-test, normal: P < 2.2 x 10 -16 ).In contrast, within the saline environment, there was no detectable difference between stabilizing and disruptive selection (Mann-Whitney U-test, normal: P = 0.514), due to a higher proportion of transcripts experiencing stronger disruptive selection.This indicates that in the saline environment, as compared to normal conditions, an increase in fitness is associated with extremes in transcript abundance.
Comparing the distribution of selection differentials between environmental conditions, we found that selection was stronger in the saline field compared to the normal wet paddy field (Table 1).This was true for the overall strength of directional selection (Mann-Whitney U-test, P=0.012), as well as for negative (P=8.27 x 10 -6 0.009), stabilizing (P < 2.2 x 10 -16 ) and disruptive (P < 2.2 x 10 -16 ) selection, but not positive selection (P=0.735).

Gene expression trade-offs.
Next we compared the proportions of genes showing an opposite direction of selection in across environments (antagonistic pleiotropy-AP) and those showing selection in only one environment (conditional neutrality-CN) (Anderson et al., 2011).To account for the inherent bias associated with the detection of CN, we used a more stringent Pvalue cutoff to define CN (P < 0.025) in comparison to AP (P < 0.05) (Groen et al., 2020;Siddiqui et al., 2021).We found that 10.80% of the transcripts showed selection patterns consistent with CN, while only 0.28% of transcripts showed AP (Fig. 1f; Supplementary Table 4).The proportion of transcripts showing CN was greater than expected by chance and greater than the proportion showing AP (two-tailed proportion z-test P < 2.2 x 10 -16 ), even accounting for the fact that there is an inherent bias favoring detection of CN (Anderson et al., 2013).These results are consistent with a lack of trade-offs in which the expression of a gene is favored in one environment and disfavored in another.
Among the 51 AP transcripts, increased expression of 38 was beneficial only in normal conditions (higher expression associated with higher fitness in normal conditions, but lower fitness under saline conditions).Gene ontology (GO) term analyses of these 38 transcripts indicated that many of these genes were involved in photosynthesis and metabolic processes (Supplementary Fig. S1; Supplementary Table 4).This is consistent with an observed reduction of photosynthesis in rice under salinity stress (Radanielson et al., 2018;Tsai et al., 2019).
We then wondered whether any difference underlie gene regulation of these AP transcripts relative to the non-AP transcripts, potentially indicating the genetic basis for this trade-off.We identified single nucleotide polymorphisms (SNPs) associated with transcript expression levels in each environment separately using whole-genome polymorphism data (expression quantitative trait loci [eQTL] analyses; see below).We found SNPs regulating expression of two photosynthesis related AP transcripts (PSAN and CRR7) only in the normal environment (Supplementary Fig. S2).We further looked at the 13 AP transcripts that were beneficial only in the saline environment (Supplementary Table 4).Although there was no significant GO enrichment for these transcripts, among them we identified a cyclophilinencoding transcript (OsCYP2), which has been shown to confer salt tolerance in rice (Lee et al., 2015;Roy et al., 2022;Ruan et al., 2011).However, no SNP was associated with the expression of this gene in either normal or saline conditions.Biological processes under selection.To investigate the broader biological processes associated with differential selection (strong selection in only one environment), we ranked all GO biological processes by their median selection strength in each environment and identified the processes with significantly stronger selection relative to their respective environment.We identified 13 and 18 processes that were under strong differential selection in normal and saline conditions, respectively (Supplementary Table 5; Fig. 2a).Processes primarily involved in various aspects of growth and defense were under stronger selection in normal conditions, whereas processes associated with regulation of flowering, cell cycle control and reproduction showed stronger selection under saline conditions (Fig. 2a).This provides insight into specific biological processes involved in flowering time regulation and reduced yield associated with salinity stress, as has been observed in multiple species (Chang et al., 2019;Li et al., 2007;Van Zandt and Mopper, 2002;R. Zhang et al., 2022).Furthermore, studies have found that the osmotic stress induced by salinity causes a reduction in the cyclin-dependent kinases (CDKs) responsible for cell-cycle transitions (G1/S and G2/M) (Ma et al., 2015;Schuppler et al., 1998;West et al., 2004).Aligned with this, our study also supports the notion that salinity stress affects cell cycle regulation, and leads to reduced growth and reproduction.
Gene expression usually operates within the context of robust gene interaction/regulatory networks (Amiri et al., 2018;Israel et al., 2016;Ko and Brandizzi, 2020).The fact that genes interact in networks means that selection on one gene can lead to indirect selection on interacting genes, which can constrain the response of a population to selection on gene expression (Agrawal and Whitlock, 2010;Groen et al., 2020;Kondrashov and Houle, 1997).To examine this phenomenon, we identified suites of correlated transcripts using principal component (PC) analysis on genome-wide gene expression levels.We then estimated the linear (β) and quadratic selection () gradients for PCs explaining over 0.5% of variance in each environment (Supplementary Table 6).Although quadratic selection was generally weak, linear selection on some PCs showed significant directional selection (Supplementary Table 7).
Using the breeder's equation (Falconer and Mackay, 1996), we predicted the response to selection on these PCs to examine the constraints on a population's microevolutionary response to selection on gene expression.We found that over half of PCs (7 of 11 and 7 of 12 PCs in normal and saline conditions, respectively) displayed opposite signs of direct and indirect responses to selection (Supplementary Table 7).This finding contrasts with prior work in rice under dry and wet conditions that found a lack of constraint in response to selection for most traits except seed size because direct and indirect responses to selection were largely similar (Ćalić et al., 2022), indicating that different types of stress may either constrain or facilitate the response to selection.
We found PCs enriched for metabolic pathways and biosynthesis of phenylpropanoids and secondary metabolites to be under selection in the normal environment (Fig. 2a; Supplementary Fig. S3; Supplementary Table 7).Different transcripts involved in these pathways were under positive and negative selection which may act to keep these pathways in a steady-state.Interestingly, circadian rhythm was found to be under positive selection, with an overall positive response to selection, in the saline environment (Fig. 2a; Supplementary Fig. S3; Supplementary Table 7).This is in alignment with the role of circadian clock genes in conferring salt tolerance (Kim et al., 2013;Wei et al., 2021;Xu et al., 2022), and indicates a tentative increase in expression of circadian clock genes with continuous exposure to soil salinity.
Salinity stress induces decoherence.Gene expression levels are generally correlated, but these correlations can be perturbed by environmental stresses, a phenomenon that is termed decoherence (Lea et al., 2019).Such decoherence has been demonstrated in humans and primates (Lea et al., 2019;Pu et al., 2022;Watowich et al., 2022), but little is known about how stress alters the correlation structure of transcript levels in plants.To examine decoherence in rice, we utilized the recently developed CILP (Correlation by Individual Level Product) method (Lea et al., 2019), which detects the systematic loss of correlation in gene expression among individuals.
Since CILP calculates product correlations for all possible pairs of genes, we used transcripts with selection strengths greater than 0.1 (|S|> 0.1) in at least one environment with expression greater than 0 in at least 50% of individuals (2,051 transcripts; Supplementary Table 8) to reduce data dimensionality.
We found that the correlation structure for gene expression was broadly similar across the 2.10 M unique transcript pairs, but > 29,000 transcript pairs (involving 1,742 unique transcripts) showed significantly different correlations between normal and saline conditions (FDR < 5%) (Fig. 3a; Supplementary Table 8).This change in correlation structure indicates possible divergence in gene interactions between environments, which may presage a restructuring of gene networks.GO term enrichment analysis of the transcripts that show decoherence with significant pairs greater than the median (median significant pair per transcript = 12, n = 853) highlighted important pathways related to plant responses and potentially tolerance to excess salt (Fig. 3b).For instance, circadian rhythm genes like OsPRR37 and GIGANTA (GI), which have been shown to confer salt tolerance in rice and Arabidopsis (Kim et al., 2013;Wei et al., 2021), differed significantly in their interactions between the two environments.Similarly, to manage the energy requirements of salt stress, the relative abundance of metabolites involved in energy producing pathways like glycolysis, tricarboxylic acid (TCA) cycle and various other metabolic processes have been shown to be altered as an early response to salt stress (Che-Othman et al., 2017;Jacoby et al., 2011;Yang and Guo, 2018).Additionally, aligned with our results, multiple studies have reported a positive correlation between salt tolerance and levels of secondary metabolites and amino acids with osmoprotectant properties associated with lowering osmotic stress (Krishnamurthy and Bhagwat, 1990;Petrusa and Winicov, 1997;Tari et al., 2010).Our results suggest that salt exposure induces decoherence of gene expression in some transcripts, that this decoherence results from a restructuring of the gene expression network, and this restructuring can allow salt stress tolerance, providing a potential molecular mechanism underlying this tolerance.

Selection on traits.
In addition to gene expression data, we also collected phenotypic data for 13 organismal traits in normal and saline conditions, which provides us with the opportunity to examine how salt stress influences complex phenotypes and identify connections between variation in traits with fitness consequences and underlying patterns of gene expression.Since these traits were measured on different scales, we estimated variance-standardized selection gradients on these traits, focusing only on seven traits that were not strongly correlated to limit the contribution of indirect selection (Pearson correlation coefficient < 0.6; Supplementary Fig. S4).We identified three traitsleaf osmotic potential (LOP), chlorophyll a content (Chl_a), and flowering time (FT)that displayed different selection patterns in normal versus saline environments (Fig. 4; Supplementary Table 8).
Leaf osmotic potential is a key trait associated with water transport in plants, and which decreases with increasing salinity, leading to low water uptake by the plant.This trait was under positive selection in the control environment but under stabilizing selection under saline conditions, which suggests that an optimal LOP is important under salt stress (Fig. 4).Furthermore, we found Chl_a content at the reproductive stage to be under negative selection in saline conditions.Although ion toxicity has previously been shown to reduce chlorophyll content (Ashraf and Bhatti, 2000;Taïbi et al., 2016), our results showed that this reduction is associated with increased survival and reproductive fitness, consistent with the general trend for reduced photosynthesis under salinity stress.We also found that FT was under positive selection (selection for later flowering) in normal conditions but under stabilizing selection in saline conditions (Fig. 4).Moreover, FT was significantly reduced under saline conditions (two-tailed paired t-test P = 0.0012; Supplementary Fig. S5), implying that salt stress selects for an optimal flowering time, with the trend shifted towards earlier flowering compared to that in the normal wet paddy.

Genetic architecture of gene expression variation.
To dissect the genetic architecture of gene expression variation, we identified expression quantitative trait loci (eQTLs) and examined whether and how selection acts on these eQTLs.We identified cis-and trans-eQTLs (FDR < 0.001) regulating expression of 3,065 and 3,277 genes in normal and salinity stress conditions, respectively (Supplementary Table 9).We observed that 49.64% (29,623 of a total 59,669) of cis-eQTLs were common in both environments, as compared to 18.62% (25,528 of a total 137,100) of trans-eQTLs; this result was robust to FDR cutoff (FDR of 0.01 and 0.05).
Moreover, this is consistent with previous observations from studies on other species that have found 48%-77% overlapping cis-eQTLs and 9%-60% common trans-eQTLs across environments (Smith and Kruglyak, 2008;Snoek et al., 2012;Sterken et al., 2023).It further indicates that trans-eQTLs might be more environment-specific than cis-eQTLs.We tested this explicitly by identifying loci showing gene-environment interaction (G × eQTL) and found that at FDR of 0.05, cis-eQTLs constituted merely 0.28% (142 of 50718) of the total identified G × eQTLs.
We identified eQTL hotspots, which are regions of the genome that are associated with expression variation of a large number of genes (Qu et al., 2018).These regions can occur either due to low amounts of recombination (high linkage disequilibrium-LD) or because they contain master-regulators that pleiotropically control expression of multiple functionally-associated genes (Hammond et al., 2011;Kliebenstein, 2009;West et al., 2007).To account for LD and identify regions likely to contain master-regulators, we chose to focus on the subset of genes regulated by the lead-SNPs (SNP with the most significant association) in a given 100-kb region.
Through this approach, we identified 11 trans-eQTL hotspots (number of unique genes > 30) in saline conditions, but none in normal conditions (Supplementary Fig. S6; Supplementary Table 10).These results indicate that gene regulation under stress conditions may be dependent on a few master-regulators, as has been shown for drought stress in rice and maize (Kuroha et al., 2017;Liu et al., 2020).Interestingly, one of the hotspots, Chr7: 25.9-26.0Mb,influenced the expression of a disproportionately high number of genes (191 genes) and contains the gene OsFLP, a R2R3 myb-like transcription factor that regulates stomatal development (Wu et al., 2019).OsFLP has recently been associated with salt tolerance in rice (Qu et al., 2022).
It has been shown previously that when a gene is regulated via both cis and trans factors, their effects tend to drive target gene expression in opposite directions, canceling the combined effect on expression.This is commonly referred to as cis-trans compensation, which is thought to arise due to stabilizing selection that maintains similar gene expression over evolutionary timescales in the face of new mutations (Coolon et al., 2014;Goncalves et al., 2012;Landry et al., 2005;Lovell et al., 2018).Contrary to this expectation, we found that most genes appear to have a reinforcing (cis and trans effect in the same direction) rather than a compensatory pattern (two-tailed proportion z-test P = 1.65 x 10 -11 ; Fig. 5a; Supplementary Table 11).Among the 524 cis-trans co-occurring genes in saline conditions, 317 were reinforcing (~60.5%) and 207 were compensatory (39.5%).This pattern held for normal conditions as well (Supplementary Fig. S7a; Supplementary Table 11).Thus, expression patterns of these genes are expected to either increase or decrease expression over evolutionary timescales, indicating the presence of directional selection.
We hypothesized that this excess of reinforcing rather than compensatory effects could be driven by the extensive diversity among rice landraces that arose from crop diversification following domestication (Meyer and Purugganan, 2013).To test this hypothesis, we examined whether genes under cis-trans reinforcement showed evidence of higher inter-varietal variation in gene expression (population-wide variance between mean accession expression levels) as compared to genes under cis-trans compensation.Supporting our hypothesis, we found significantly higher inter-varietal expression variation for genes under cis-trans reinforcement in saline (Fig. 5b; Mann-Whitney U-test = 0.002; mean-inter-varietal expression variation compensating = 1.12, reinforcing = 1.27) and in normal (Supplementary Fig. S7b) conditions.Furthermore, it has been suggested that cis-trans compensation/reinforcement can arise due to the genetic fixation of compensating/reinforcing trans-regulatory variants (McManus et al., 2014;Signor and Nuzhdin, 2018), which could lead to elevated linkage disequilibrium (LD) between the cis-and trans-regulatory variants acting on a gene.This was indeed the case: estimated LD between all pairs of cis-and trans-regulatory variants for the identified compensating and reinforcing genes was significantly higher as compared to the background LD (mean r 2 compensating: 0.33, reinforcing: 0.36, background = 0.056; two-tailed permutation test To examine the pattern of past selection acting on cis-and trans-eQTLs (before, during and after rice domestication, including the crop diversification phase) we plotted the folded sitefrequency spectrum (SFS) of the minor allele frequencies and inferred the relative strength of historic selection (Joly-Lopez et al., 2020) We found that trans-eQTLs for both environments had been under strong purifying selection, with the SFS being significantly left-shifted compared to background SNPs (two-tailed t-test P < 10 -16 ; mean MAF background = 0.20, trans normal = 0.0936, trans saline = 0.0910).Not surprisingly, given their highly pleiotropic nature, this effect was more pronounced for trans-eQTLs regulating multiple genes (two-tailed t-test P < 10 -16 ; normal mean MAF in unique trans-eQTLs = 0.099, multiple trans-eQTL = 0.073; saline mean MAF in unique trans-eQTLs = 0.097, multiple trans-eQTL = 0.071).In contrast to trans-eQTLs, we found that cis-eQTLs in both normal and saline conditions had potentially been under balancing selection (Fig. 5c, Supplementary Fig. S8) with the SFS being significantly rightshifted for cis-eQTLs relative to background SNPs (two-tailed t-test P < 10 -16 ; mean MAF background = 0.20, cis normal = 0.238, cis saline = 0.237).Moreover, for both conditions, the estimated nucleotide diversity (π) for 100-kb regions flanking cis-eQTLs was also significantly higher as compared to genome-wide 100-kb blocks (two-tailed t-test P < 10 -11 ; mean π background = 0.286, cis normal = 0.297, cis saline = 0.297).
Comparing patterns of past selection for cis-eQTLs specific to each condition revealed no significant difference between the normal and saline environments.However, we did observe trans-eQTLs in saline conditions to be under stronger purifying selection (two-tailed t-test P = 0.017).Taken together, this indicates that cis-and trans-eQTLs are under different selection regimes in rice, and that purifying selection is stronger on trans-eQTLs regulating gene expression under salt stress conditions, giving us insights into the macroevolutionary dynamics of gene-expression variation.

DISCUSSION
Gene expression is a key link in the chain of adaptive organismal responses of organisms to environmental challenges.In this study, we used a systems genomics approach to examine genome-wide transcript levels in rice in both normal and moderately saline field conditions to examine the evolutionary response and genetic architecture of gene expression under salinity stress.We found selection on expression of a set of genes, and for some genes, selection on expression differed among environments, indicating that salinity can select for changes in gene expression.We saw that the genetic regulatory pathways can be modified by salinity, as indicated by decoherence, which provides a potential genetic mechanism underlying salinity tolerance.We found limited evidence for trade-offs given a lack of antagonistic pleiotropy in the fitness effects of expression compared across environments, and we also found that there did not appear to be strong cis-trans compensation in gene regulation.These results provide us with insights into the fitness consequences and genetic architecture of gene expression under salt stress in rice, as we discuss below.
One of the primary goals of this study was to characterize selection on gene expression and how this selection varies between saline and normal conditions.Before discussing our results, it is important to note that our conclusions are influenced by the way in which selection differentials are standardized.Selection differentials are estimated through regression coefficients in the relationship between trait values and fitness (Lande 1979).Because traits are often measured on different scales, selection differentials are usually standardized to allow for meaningful comparisons and interpretation.The most common method of standardization is variance standardization, in which traits are standardized to a mean of zero and a standard deviation of one (Lande and Arnold, 1983).With this approach, the selection differential, multiplied by the heritability, gives the expected response to selection in standard deviation units.So while this is a valid approach, there is concern that when variance-standardized selection differentials are interpreted as reflecting the "strength" of selection based on their magnitude, they can be misleading because trait variance can vary among traits and environments, such that the trait variance and the magnitude of selection are conflated (Hereford et al., 2004).Instead, the use of mean-standardized selection estimates have been suggested (Hereford et al., 2004;Matsumura et al., 2012).However, it has not been widely recognized that mean-standardized selection estimates can also be biased if mean trait values vary consistently across conditions, such that magnitude of selection is conflated with trait means.We show here that this was indeed the case: the mean gene expression values differed consistently between saline and normal conditions, leading to biases when comparing the strength of selection across treatments when mean-standardized selection differentials were used.Specifically, using meanstandardized selection coefficients, we found selection on gene expression to be stronger under normal conditions, whereas unstandardized and variance-standardized selection coefficients indicated stronger selection under saline conditions.This result indicates that it is important for researchers to carefully consider the advantages and potential drawbacks of both variancestandardized and mean-standardized, as well as unstandardized, selection differentials in making decisions about which to use in any particular system and depending of the goals of the study, but also to consider these ideas in interpreting selection differentials as reflecting the strength of selection.
Because we found that both variance-standardized and mean-standardized selection gradients were biased, as explained above, we draw our conclusions about patterns of selection using unstandardized selection gradients.We believe this approach to be appropriate given that gene expression values are measured on the same scale and normalized for every transcript, and our goal is to identify transcripts under selection and compare selection across treatments.Using unstandardized selection coefficients, our study shows that most genes appear to have nearly neutral levels (|S| < 0.1) of selection on transcript levels in both normal and saline environments.This is consistent with previous studies conducted in various plant and non-plant species, which have shown that most traits (including transcript abundance) are under very weak selection, and only a few traits experience strong selection at microevolutionary timescales (Ahmad et al., 2021;Hoekstra et al., 2001;Kingsolver et al., 2001).There does seem to be a tendency, however, towards stronger positive selection over negative selection on gene expression, meaning that higher expression of a majority of genes is generally associated with higher fitness.
Looking at quadratic selection, we found evidence for both stabilizing and disruptive selection on gene expression.Our results are consistent with the expectation of stabilizing selection being more common than disruptive selection in normal conditions (Kingsolver et al., 2001), and reinforce the idea that disruptive selection is more widespread under stress conditions (Ahmad et al., 2021), potentially reflecting the prevalence of frequency-and density-dependent competition for resources under stress (Kingsolver and Pfennig, 2007).
Comparing the distribution of selection coefficients, we further found that selection was stronger in moderate stress saline field conditions compared to normal wet paddy field conditions, indicating that salinity stress induced increased selective pressure on gene expression at microevolutionary timescales.Although exposure to salinity stress increased the strength of selection on gene expression, this increase was relatively low compared to what has been reported for drought stress in rice (Agrawal and Whitlock, 2010;Groen et al., 2020;Kondrashov and Houle, 1997).This could potentially be attributed to the levels of stress experienced by the plantsthe drought stress treatment was more severe in terms of fitness after stress exposure (Agrawal and Whitlock, 2010;Groen et al., 2020;Kondrashov and Houle, 1997) in comparison to the moderate salinity stress in this studybut further work is needed to examine whether there is indeed a relationship between stress intensity and selection strength.While this work provides insight into the direction of selection and the predicted degree of evolutionary change under the specific conditions of this study, more work would need to be done over multiple generations and under additional conditions to make more realistic predictions of microevolutionary change in the field.
Our work contributes to an ongoing debate regarding whether the intensity of selection on gene expression increases under stress.We found in this study of rice and salinity, and in a prior study on drought (Groen et al. 2020), that directional selection was greater under stress compared to normal conditions.This finding is consistent with prior studies on other systems like fruit flies (Drosophila melanogaster) (Groen et al., 2020;Jasnos et al., 2008;Kondrashov and Houle, 1997) and yeast (Groen et al., 2020;Jasnos et al., 2008;Kondrashov and Houle, 1997), but contrasts with a recent study in D. melanogaster reporting no change in selection due to increasing levels of nutritional stress (Arbuthnott and Whitlock, 2018).Our results also support prior findings in rice (Ćalić et al., 2022;Groen et al., 2020) and in Boechera stricta (Anderson et al., 2013) that antagonistic pleiotropy occurs and can be important for local adaptation, while conditional neutrality in gene expression may be much more common.Our finding indicates that it may be feasible to breed salinity tolerant rice varieties without a yield penalty under non-saline conditions, though further work is needed to verify this.
We examined the genetic architecture of gene expression variation as well, and found that trans-eQTLs rather than cis-eQTLs are primarily associated with rice gene expression under salinity stress, potentially via a few master-regulators.Trans-eQTLs may be important in environment-dependent gene expression changes, while cis-eQTLs may be more robust to environmental changes as has been observed in other species (Smith and Kruglyak, 2008;Snoek et al., 2012;Sterken et al., 2023).This can be attributed to trans-eQTLs' larger mutational target along with a larger effect of drift than of positive selection in fixing them.These results further corroborates with other studies that have found a more dominant effect of trans-eQTLs on within-species variation than between-species variation (Emerson et al., 2010;Metzger et al., 2016;Wittkopp et al., 2008).

We found that cis-and trans-eQTLs show different patterns of selection, with cis-eQTLs
showing evidence for balancing selection and trans-eQTLs showing purifying selection.We see this pattern for rice, a largely selfing species, in contrast to outcrossing species where both cisand trans-eQTLs have been found to be under negative selection (Hernandez et al., 2019;Josephs et al., 2015).Future studies will be needed to investigate whether the mating system has an influence on the pattern of natural selection acting on variants regulating gene expression.
Finally, and contrary to expectations, we show that under saline conditions, cis-trans reinforcement is more prevalent than cis-trans compensation.This result may be driven by rice domestication and subsequent population diversification.Additionally, we find significantly elevated levels of LD among the cis-and trans-regulatory variants for compensating and reinforcing genes, indicating the role of genetic fixation as an underlying mechanism for cistrans compensation/reinforcement.Systems genomic approaches provide insights into large-scale patterns across genomewide data over multiple scales.This approach can also identify possible genes or genetic pathways that may prove critical in various processes.Our analysis, for example, shows that many of the potential antagonistically pleiotropic genes are involved in photosynthesis and metabolic processes.Importantly, we found that the circadian rhythm pathway was under positive selection (selection for increased expression) in saline conditions and that the genes involved in circadian rhythm showed significant decoherence between the two environments.
These findings make sense in light of the fact that circadian rhythm has been shown to regulate salt tolerance and flowering time in rice (Liang et al., 2021;Wei et al., 2021), so responses to salt stress may alter the expression patterns of genes in the circadian rhythm network, leading to decoherence.In further support of this idea of a link between salt tolerance and circadian rhythm, our selection analyses on physiological traits shows that salt stress leads to selection for earlier flowering.Additionally, our decoherence analyses identified transcripts in important pathways related to plant responses and potentially tolerance to excess salts, including the carbon metabolic pathways (glycolysis and tricarboxylic acid cycle), and accumulation of secondary metabolites and sugar moieties with osmoprotectant properties (Jacoby et al., 2011;Krishnamurthy and Bhagwat, 1990;Petrusa and Winicov, 1997;Tari et al., 2010;Yang and Guo, 2018).
Other genes that are beneficial only in the saline environment include a cyclophilinencoding transcript (OsCYP2), which has been shown to confer salt tolerance in rice (Ruan et al., 2011).Moreover, we also identified eQTL variants regulating expression of two photosynthesis related antagonistically pleiotropic transcripts (PSAN and CRR7) in the normal environment.
Finally, our trans-eQTL analysis identified a hotspot on chromosome 7 that contains OsFLP, a R2R3 myb-like transcription factor that regulates stomatal development (Wu et al., 2019), and appears to be involved in salt tolerance in rice (Qu et al., 2022).Together, these loci are possible new targets for functional and translational studies of salinity stress in rice.Coupled with the insights gained by a systems genomic approach in inferring large scale patterns from genomewide information, this integration of data across multiple scales across a rice population has allowed us to provide an integrated examination of the molecular and genetic landscape underlying adaptive plant salinity stress responses.

MATERIALS AND METHODS
Plant Material.Domesticated rice is primarily classified into two distinct genetic subgroups, O. sativa ssp.indica and O. sativa ssp.japonica.These subgroups are grown in sympatry and are often recognized as subspecies, given the reproductive barriers between them (Nadir et al., 2018).Further analyses have identified a widely accepted classification consisting of genetically distinct varietal groups namely, indica, aus/circum-aus, aromatic/circum-basmati, tropical japonica, and temperate japonica (Garris et al., 2005).For this study, a total of 130 O. sativa ssp.
indica (including indica and circum-aus groups) and 65 O. sativa ssp.japonica (including circum-basmati, tropical japonica, and temperate japonica) accessions were selected, including traditional varieties/landraces and three additionally replicated salt-sensitive and -tolerant test varieties.We focused our analyses on O. sativa ssp.indica since it is the predominant global varietal group (Supplementary Table 12).Seeds were obtained from the International Rice Genebank Collection (IRGC) at the International Rice Research Institute (IRRI) in the Philippines, and from a 2016 bulk seed collection obtained from plants grown under normal (wet and non-saline) conditions at IRRI during the course of a previous study (Groen et al., 2020).
Within each field environment, there were three blocks and three replicates of each genotype (accession), with each genotype planted once per block in a random location.Each plant was planted in a single-row with 0.2-m × 0.2-m spacing between them for a total of one focal plant and seven neighboring plants (included in the experiment) per plot.Each experimental plot included the accessions NSIC Rc 222 and NSIC Rc 182 that served as border rows (Supplemental Fig. S9; Supplemental Table S13).The application of salt in site L5 started on January 19, 2017, when the plants were 31 days old.The salinity level was monitored by recording electrical conductivity (EC), using EC meters installed in each of the parcels at a depth of 30 cm.The EC levels were recorded twice per day until reaching an EC=6 dSm -1 and then recorded daily.The salinity levels were then maintained at 6 dSm -1 (considered mild to moderate salinity stress) until maturity.Management and maintenance of the fields included the application of basal fertilizer, spraying of insecticides against thrips and removal of plants potentially infected with rice tungro virus.Tissue collection for transcriptome sequencing.Leaf sampling was performed as previously described (Groen et al., 2020).Briefly, leaf collection in the non-saline and saline field was done from 10:00h to 12:00h at 38 DAS (8 days after the beginning of the salt treatment) in the nonsaline and saline field from 10:00h to 12:00h.Both fields were sampled simultaneously and with individuals within a block collected in the same order.For each sample, about 10 cm of leaf length were cut into small pieces and placed in chilled 5-mL tubes containing 4 mL of RNALater (Fisher Scientific) solution for RNA stabilization and storage.Leaf samples from each of the 5-ml tubes were then transferred into pairs of 2-mL tubes (one for processing and one for backup), then stored at − 80 °C.
Yield harvesting and panicle trait phenotyping.A total of 780 plants were harvested individually and labeled such that the yield of all plants used for each type of measurement (mRNA sequencing, phenotypic measurements) was known.Individual seeds collected were further categorized as filled, partially filled, and unfilled, using manual assessment and a seed counter (Hoffman Manufacturing).Panicle length measurements and panicle trait phenotyping (PTRAP) of 30 seeds were also performed.

Functional trait phenotyping.
In addition to yield-related measurements, we collected data on a number of physiological, morphological, and phenological traits to assess differences between rice accessions in response to soil salinity.In both the non-saline and saline fields, we recorded leaf osmotic potential (LOP) in the vegetative stage, performed chlorophyll analysis based on 1mg leaf samples and measured ion content (analysis of sodium, Na + , and potassium, K + , analysis) based on 20-mg of leaf samples in both the vegetative and reproductive stages.We also measured plant height for growth rate (measured once a week until maturity).Flowering time was recorded as the day on which 50% of plants in a plot flowered; these plants included the focal plant and its seven neighboring plants.Whole plants were both harvested at the vegetative stage and at maturity to measure wet and dry biomass.
Extraction of total RNA for library construction.Leaf samples stored at −80 °C were thawed at room temperature briefly and excess RNALater was removed.Tissue samples were then flashfrozen in liquid nitrogen and ground using a TissueLyser II (Qiagen).After this, total RNA was extracted using the RNeasy Plant Mini Kit according to manufacturer's protocol (Qiagen) and eluted in nuclease-free water.The integrity of total RNA was assessed by agarose gel electrophoresis, and RNA from a random subset of samples was further assessed by Agilent TapeStation (Agilent Technologies).RNA concentration was quantified on a Qubit (Invitrogen).
Samples were stored at −80 °C until library preparation.
RNA-seq library preparation and sequencing.Library preparation for 780 samples was performed as described in Groen et al. (2020) and followed a plate-based 3′-end mRNA sequencing (3′ mRNA-seq) protocol.Briefly, total RNA from each sample was transferred individually into 96-well plates and normalized to a concentration of 10 ng in 50 μL nucleasefree water.Then, mRNA samples were reverse-transcribed using Superscript II Reverse Transcriptase (Thermo Fisher Scientific) and cDNAs were amplified using the Smart-seq2 protocol (Picelli et al., 2013) with modifications (Groen et al., 2020).This resulted in multiplexed pools of 96 samples each that were used for library preparation with the Nextera XT DNA sample prep kit (Illumina), returning 3′-biased cDNA fragments, similar to the Drop-seq protocol (Macosko et al., 2015).The resulting cDNA libraries were then quantified on an Agilent BioAnalyzer and sequenced at the NYU Genomics Core on an Illumina NextSeq 500 with the configuration HighOutput 1 x 75 base pairs (bp) and the settings: Read 1 of 20 bp (bases 1-12, well barcode; bases 13-20, unique molecular identifier (UMI)) and Read 2 of 50 bp.Raw sequence reads have been submitted to the SRA (BioProject PRJNA1010833).
RNA-seq data processing and data normalization.3′ mRNA-seq read data were processed as previously described in Groen et al. (2020).Briefly, Drop-seq tools v1.12 (https://github.com/broadinstitute/Drop-seq)and Picard tools v2.9.0 (https://broadinstitute.github.io/picard/)were used to generate the metadata.The reference genome, Nipponbare IRGSP 1.0 (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/433/935/GCF_001433935.1_IRGSP-1.0), and annotations were indexed with STAR v020201 (Dobin et al., 2013).Prior to generating read counts, raw reads were converted from FASTQ to unaligned BAM format using Picard tools FastqToSam before being processed using the unified script for a FASTQ starting format.After this, digital gene-expression matrices displaying either UMI or raw read counts with transcripts as rows and samples as columns containing counts from reads with 96 expected sample barcodes were produced using the DigitalExpression utility.Sample barcodes corresponding to beads never exposed to rice total RNA were filtered out based on low numbers of transcribed elements as described previously (Groen et al., 2020;Macosko et al., 2015).Rice individuals that ended up being discarded due to low numbers of transcribed elements, were sequenced again using another library.UMI counts per sample were normalized through dividing by the total number of detected UMIs in that sample and multiplying by 1 × 10 6 to obtain transcripts per million.The resulting data matrices were then merged into one digital gene-expression super-matrix, containing transcripts-per-million expression data for all samples.Elements with very low transcription levels (transcript models with a sigma signal < 20) were discarded, after which a robust normalization was conducted using an invariant set normalization protocol within the DChip utility v2010.01(Li and Wong, 2001).All downstream analyses were done in log-space, using normalized expression levels (log 2 [normalized transcripts-per-million value + 1]) of transcribed elements estimated using R v3.4.3 (Computing, 2013;Robinson et al., 2010).In a final step of filtering, transcripts that were not detected in at least 10% of individuals across our populations and did not derive from protein-coding genes on nuclear chromosomes were removed prior to performing subsequent analyses.
Quantitative genetics of fecundity and gene expression.The effect of genotype (G), environment (E) and genotype-by-environment (G×E) on fecundity (number of filled grains) was assessed using two-way analysis of variance (ANOVA) with E as a fixed effect, and G and G×E as random effects.The significance of each term was determined using the F-tests (fixed effect) and likelihood ratio test (random effects).Fecundity, averaged by genotype, was further compared between the environments using a two-tailed paired t-test.Variation in gene expression was partitioned similarly using the same model as above, and significance of each term was tested using F-tests via a mixed-model ANOVA (Howell, 1997).Multiple testing was controlled using a False-Discovery Rate (FDR) of 0.001.Broad-sense heritabilities were estimated as H 2 = σ 2 G /( σ 2 G + σ 2 GE /e + σ 2 E /re), with σ 2 G , σ 2 E and σ 2 GE as the variance explained due to G, E, and G×E; e and r represent the number of environments and number of replicates per environment, respectively.Inter-varietal differences in gene expression was estimated for each environmental condition as the population-wide variance between accession mean expression levels.
Univariate and multivariate selection analyses.Univariate selection differentials consider each trait separately and represent the total strength of selection acting on a trait (Lande, 1979;Lande and Arnold, 1983).Fitness was estimated as fecundity, which was the total number of filled rice grains per individual.Unstandardized linear selection differentials (S) on gene expression were estimated as coefficients of linear regression with fecundity as the dependent variable in each regression and transcript abundance and block as the independent variables.Unstandardized quadratic selection differentials (C) were estimated similarly as twice the coefficients of quadratic regression for transcript abundance and fecundity (Conner and Hartl, 2004;Stinchcombe et al., 2008).Using these we then estimated the variance-standardized, and meanstandardized selection differentials (Hereford et al., 2004;Lande and Arnold, 1983).Data preparation included filtering out individuals with zero fecundity followed by normalizing fecundity fitness by mean fitness.Further, transcript abundances were standardized by subtracting population mean abundance of the transcript and dividing by the standard deviation (SD) of the same transcript.To satisfy normality and remove noise inherent in expression data, transcripts with expression values more than 3 standard deviations from the mean were removed, which affected fewer than 1% of individuals.Selection differentials were estimated for transcripts that were expressed in at least 20 individuals and were estimated separately in each of the two environments.Multiple testing was controlled using Bonferroni correction (Bland and Altman, 1995).
Multivariate selection gradients represent the strength of direct selection acting on each trait, after removing indirect selection caused by correlations with other traits (Lande and Arnold, 1983).Principal component analysis (PCA) was performed on transcript abundance using the prcomp function in R (Computing, 2013;Kassambara, 2017) and PCs explaining over 0.5% of variance in each environment were chosen for multivariate selection analyses.Linear (β) and quadratic () selection gradients were estimated as coefficients of multiple regression with the normalized fecundity fitness as the dependent variable and the PCs as the independent variables (Conner and Hartl, 2004).We then chose the top one percent of transcripts showing the highest loadings of the PCs (n=182) and counted the number of transcripts showing evidence for positive selection (same directionality between loading of the transcript on the PC and univariate selection differential (S) acting on the transcript estimated above) versus negative selection (opposite directionality between loading of the transcript on the PC and univariate selection differential (S) acting on the transcript estimated above) and based on the majority assigned a directionality to the selection gradients on PCs.Variance-standardized multivariate selection gradients were estimated for functional traits without strong correlation to avoid collinearity among traits (Lande and Arnold, 1983;(Presotto et al., 2019), using a Pearson correlation coefficient < 0.6 as threshold, which was estimated using the cor function in R (Computing, 2013).
Selection analyses on gene ontology biological processes.Gene Ontology (GO) term annotations for rice genes/transcripts were downloaded from Monocots PLAZA 5.0 (Van Bel et al., 2022).All fourth-level biological-process terms were downloaded using GO.db v3.15.0 (Carlson, 2022) and only these terms were considered for further analyses.Next, terms with fewer than 20 transcripts in our dataset were filtered out to minimize redundancy, leaving a total of 670 terms and 10,235 associated transcripts.The selection strength on a biological-process term was estimated as the median selection strength of all transcripts annotated with that term.A term was considered to be under significantly stronger selection compared to the transcriptomewide median if the median strength of selection for a term was over the transcriptome-wide median selection strength by at least the 95% confidence interval for the selection strength of that term.GO enrichments were done using ShinyGO (Ge et al., 2020).
Regulatory decoherence analyses.To examine regulatory decoherence in rice, a recently developed method, CILP (Correlation by Individual Level Product), was used (Lea et al., 2019).
Since CILP calculates product correlations for all possible pairs of genes, only transcripts with a selection strength greater than 0.1 (|S| > 0.1) in at least one environment with expression greater than 0 in at least 50% individuals were included to reduce the dimensionality (leaving 2,318 transcripts).Multiple testing was controlled using a False-Discovery Rate (FDR) of 0.05.
Genotype data and SNP calling.Raw FASTQ files were downloaded from the Sequence Read Archive (SRA) website under BioProject PRJEB6180 (Wang et al., 2018) and under bioprojects PRJNA422249 and PRJNA557122 for 92 accessions (Gutaker et al., 2020) (Supplementary Table 12).Further, genomes of 19 accessions were re-sequenced, and submitted to the SRA (BioProject PRJNA1012700), leading to a total of 125 accessions for which genomic data was available (Supplementary Table 12).

G-matrix estimation and prediction of short-term phenotypic evolution. A G-matrix (G)
representing the additive genetic variance and covariance was estimated for the principal component axes (PCs) by taking the eigengene.This was done by deploying GREML v1.94 (Yang et al., 2011).Although the principal components are by definition uncorrelated at the level of the individual replicate plants, they start showing genetic covariances when loading values of replicates from each genotype are averaged.Next, using the multivariate breeder's equation (Δ z = G β), we predicted the response to short-term phenotypic selection (Δz) on the PCs.
Association mapping.Association mapping was performed between the SNP markers and gene expression values recorded in the normal and saline environments.For this, the linear model in Matrix eQTL was used (Shabalin, 2012).The normalized gene expression values were averaged over the replicates in each environment separately and these averages were subsequently used to test for associations.The first five principal components (PCs) of the kinship matrix were estimated using GAPIT v3 (Wang and Zhang, 2021) and added as covariates to control for population structure.Associations were considered significant at a false-discovery rate (FDR) < 0.001, and when significant were included in downstream analyses.Due to long stretches of homozygosity attributed to the highly inbred nature of rice, trans-eQTLs were defined as being on a different chromosome or at least 1 Mb away from a gene under its influence on the same chromosome; cis-eQTLs were defined as <100 kb away from an associated gene.
To identify significant G × eQTLs, we ran Matrix eQTL on the difference of expression in the normal and saline conditions (normalsaline) at FDR 0.05 (Smith and Kruglyak, 2008).
For each gene regulated by a SNP in cis or trans, a lead SNP was identified as the SNP with the most significant association within a 100-kb region.Furthermore, trans-eQTL hotspots were identified through analyzing the number of unique genes regulated by lead SNPs in a given 100-kb region.To detect genes under cis-trans compensation or reinforcement, the effect sizes of all lead trans-eQTLs were averaged for each gene and compared to the lead cis-eQTL for the same gene.Next, for these genes we estimated the mean proportion of individuals with opposite and same direction cis-trans allelic configuration.Genes were defined as compensating and reinforcing if they had at least 60% of individuals with opposite and same cis-trans allelic configuration, respectively.To examine the LD structure, we estimated r 2 using plink v1.9 (Purcell et al., 2007) between (1) all pairs of cis-and trans-variants for the identified compensating and reinforcing genes and (2) 1,000 datasets of randomly selected SNP pairs (with equal numbers of variant pairs and a similar distribution of distances between the cis-and transvariants as in 1).Next, we compared r 2 between (1) and ( 2) using a two-tailed permutation test.
To examine the patterns of past selection on eQTLs, we used the minor allele frequency (MAF) of the 246,714 SNP markers and compared these using the t.test function in R (Computing, 2013).Furthermore, we estimated the site-wise nucleotide diversity (π) and averaged it over 50-kb flanking regions around each cis-eQTLs (100-kb region total).We compared this π to the background nucleotide diversity, estimated as π averaged over 100-kb blocks throughout the genome minus the 100-kb cis-eQTL region above.The difference in mean of nucleotide diversity was tested using the t.test function in R (Computing, 2013).

Figure 2 :
Figure 2: Biological processes and pathways with differential responses to selection under saline conditions.a, GO biological processes under stronger selection in normal (blue) and saline conditions (pink).b, Linear selection gradients (β), along with direct (D), indirect (I) and total (T) responses to selection on suites of transcripts in normal (blue) and saline conditions (pink).

Figure 3 :
Figure 3: Salinity stress induces regulatory decoherence.a, Pearson correlation coefficients between pairs of transcripts (|S| > 0.1 and expression greater than 0 in at least 50% individuals) in normal (x-axis) and saline conditions (y-axis).Pink and blue represent pairs with correlation stronger in saline and normal conditions, respectively; gray represents correlation that is not significantly different between conditions.b, Enrichment of transcripts with significant pairs greater than the median (median significant pair per transcript = 12, n = 853) involved in regulatory decoherence post salt exposure.

Figure 4 :
Figure 4: Traits with different selection profiles under salt stress.Linear (β) and quadratic (γ) selection gradients on the traits LOP (leaf osmotic potential), Chl_a (chlorophyll a content), and FT (flowering time).Error bars represent mean ± SE; dots and asterisks indicate significance of selection-gradient at two-sided unadjusted P < 0.1 and P < 0.05.

Figure 5 :
Figure 5: Genetic architecture of gene expression variation during salt stress.a, Effect sizes of genes with both cis and trans factors under saline conditions showing excess of reinforcing cis-trans (teal) in comparison to compensating cis-trans (salmon).b, Inter-varietal variation in gene expression for genes under compensating control is significantly lower than for those under reinforcing control; one-sided Mann-Whitney P = 0.0028.c, Frequency distribution of MAF (minor allele frequency) for cis-eQTLs (blue) and trans-eQTLs (yellow) in saline conditions against the genome-wide background (gray).
was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.It is made